ABSTRACT

The False Discovery Rate (FDR) method has recently been described by Miller et al. (2001), 
along with several examples of astrophysical applications. FDR is a new statistical procedure due 
to Benjamini & Hochberg (1995) for controlling the fraction of false positives when performing 
multiple hypothesis testing. The importance of this method to source detection algorithms is 
immediately clear. To explore the possibilities offered we have developed a new task for performing
source detection in radio-telescope images, Sfind 2.0, which implements FDR. We compare
Sfind 2.0 with two other source detection and measurement tasks, Imsad and SExtractor, and
comment on several issues arising from the nature of the correlation between nearby pixels and 
the necessary assumption of the null hypothesis. The strong suggestion is made that implement- 
ing FDR as a threshold defining method in other existing source-detection tasks is easy and 
worthwhile. We show that the constraint on the fraction of false detections as specified by FDR 
holds true even for highly correlated and realistic images. For the detection of true sources, which 
are complex combinations of source-pixels, this constraint appears to be somewhat less strict. It 
is still reliable enough, however, for a priori estimates of the fraction of false source detections to 
be robust and realistic. 

Subject headings: methods: data analysis — methods: statistical — techniques: image processing 



1. Introduction 

Detecting and measuring the properties of ob- 
jects in astronomical images in an automated fash- 
ion is a fundamental step underlying a growing 
proportion of astrophysical research. There are 
many existing tasks, some quite sophisticated, for 
performing such analyses. Regardless of the wave- 
length at which an image has been made however, 
each of these tasks has one thing in common: A 
threshold needs to be defined above which pixels 
will be believed to belong to real sources. 

Defining an appropriate threshold is a complex 
issue and, owing to the unavoidable presence of 
noise, any chosen threshold will result in some true
sources being overlooked and some false sources 
measured as real. Varying the chosen threshold 
to one extreme or the other will minimise one of 
these types of error at the expense of maximising 
the other. Clearly choosing a threshold to jointly 
minimise both types of error is not trivial, but even 
more problematic is that it is not even clear that 
one can, a priori, make a well defined estimate 
of the magnitude of each type of error. Typically 
this is done by comparing measured source counts 
with existing estimates for the expected number of 
sources. This is an unsatisfactory solution for sur- 
veys reaching to new sensitivity limits, or at pre- 
viously uninvestigated wavelengths (where there 
can be no estimate of the expected number of 
sources from existing studies), and also for small 
area imaging where the properties of large populations
are affected by clustering or small number
statistics. 

Throughout this paper we shall use the term
'source-pixel' to mean a pixel in an image which 
is above some threshold, and thus assumed to be 
part of a true source. The term 'source' shall be 
used to mean a contiguous collection of 'source- 
pixels' which corresponds to an actual astronom- 
ical object, a star or galaxy for example, whose 
properties we are interested in measuring. In 
this work we concentrate on radio images, specif- 
ically images produced by radio interferometers. 
We emphasise, though, that all of the conclusions 
presented here are valid for any pixelized map 
where the 'null hypothesis' is known at each pixel. 
Throughout this paper the null hypothesis is taken 
to be the 'sky background' at each pixel (after the 
image is normalized). 

Now consider the following possible criteria ap- 
plied to some chosen threshold: (1) That there be 
no falsely discovered source-pixels; (2) That the 
proportion of falsely discovered source-pixels be 
some small fraction of the total number of pix- 
els (background plus source); (3) That the frac- 
tion of false positives (i.e. the number of falsely 
discovered sources over the total number of dis- 
covered sources) be small. The first of these can 
be achieved by using a very high threshold called 
the Bonferroni threshold (Miller et al. 2001). This 
threshold is rarely used since, although guaran- 
teeing no false detections, it detects so few real 
sources. The second criterion is most often applied
in astronomy, and can be achieved by simply
choosing the appropriate significance threshold.
A threshold of 3σ, for example, ensures that
about 0.1% of the total number of pixels are falsely
discovered. Unfortunately this is not the same as a
constraint on the fraction of false detections com- 
pared to total number of detections. This quantity 
is a more meaningful measure to use in defining 
a threshold for the following reason. Consider a 
3σ threshold in an image composed of 10^6 pixels
(1000 × 1000) and containing only Gaussian noise.
This would yield, on average, 1000 pixels above the 
threshold. If real sources are also present, these 
1000 pixels appear as false source-pixels, and if 
it happens that only 2000 pixels are measured as 
source-pixels, then half the detections are spurious!
If many more true source-pixels are present,
of course, this threshold may be quite adequate.
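
As a rough numerical check of this illustration, the sketch below evaluates the one-sided Gaussian tail probability at 3σ and the corresponding mean number of noise pixels in a 1000 × 1000 image. The exact tail is nearer 0.135% than 0.1%, so the figure of 1000 pixels quoted above is a round number. The function names are chosen here for illustration only.

```python
import math

def tail_probability(threshold_sigma):
    """One-sided probability that a unit Gaussian exceeds the threshold."""
    return 0.5 * math.erfc(threshold_sigma / math.sqrt(2.0))

def expected_false_pixels(n_pixels, threshold_sigma):
    """Mean number of pure-noise pixels lying above the threshold."""
    return n_pixels * tail_probability(threshold_sigma)

n_false = expected_false_pixels(1000 * 1000, 3.0)
n_detected = 2000  # total number of source-pixels in the illustration above
print(f"expected false pixels: {n_false:.0f}")
print(f"false fraction of detections: {n_false / n_detected:.2f}")
```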

The third criterion defines a more ideal thresh- 
old. Such a threshold allows one to specify a priori 
the maximum number of false discoveries, on aver- 
age, as a fraction of the total number of discover- 
ies. Such a method should be independent of the 
source distribution (i.e., it will adapt the thresh- 
old depending on the number and brightness of 
the sources). The False Discovery Rate (FDR) 
method (Benjamini & Hochberg 1995; Benjamini 
& Yekutieli 2001; Miller et al. 2001) does precisely 
this, selecting a threshold which controls the frac- 
tion of false detections. We have implemented the 
FDR technique in a task for detecting and measur- 
ing sources in images made with radio telescopes. 

Radio images were chosen for the current anal- 
ysis for several reasons, including previous experi- 
ence at coding of radio source detection tasks, but 
also since the conservative nature of constructing 
many radio source catalogues allows the value of 
this method to be emphasised. Traditionally ra- 
dio source catalogues are constructed in a fash- 
ion aimed at minimising spurious sources, accom- 
plished by selecting a very conservative threshold, 
which is usually 5σ or even 7σ. This is partly
driven by the difficulty of completely removing 
residual image artifacts such as sidelobes of bright 
sources, even after applying very sophisticated 
image reconstruction methods, and the desire to 
avoid classifying these as sources. Many of the 
issues surrounding radio source detection are de- 
scribed by White et al. (1997). While such a con- 
servative approach may minimise false detections, 
it has the drawback of not detecting large numbers 
of real, fainter, sources important in many studies 
of the sub-mJy and μJy populations, for example.
An FDR defined threshold may allow many more 
sources to be included in a catalogue while provid- 
ing a quantitative constraint to be placed on the 
fraction of false detections. 

Table 1 briefly describes the algorithms used in 
some common radio source detection and measurement
tasks. There is much similarity in the selection
of tasks available within the two primary image
analysis packages, AIPS (Astronomical Image
Processing System) and MIRIAD (Multichannel
Image Reconstruction, Image Analysis and Display),
as alluded to in the summaries given in
this table, although the specifics of each such task
certainly differ to a greater or lesser extent. In addition
to these packages, AIPS++ also has an image
measurement task, imagefitter, similar in concept
to the Imfit tasks in AIPS and MIRIAD. A stand-alone
task, SExtractor (Bertin & Arnouts 1996),
is used extensively in image analysis for detect- 
ing and measuring objects, primarily in images 
made at optical wavelengths. Given the flexibil- 
ity of the SExtractor code provided by the many 
user-definable parameters, it is possible to use this 
task also for detecting objects in radio images. 

In the analysis below we compare the effectiveness
of Sfind 2.0, which runs under the MIRIAD
package, with that of Imsad (also a MIRIAD task,
and which employs essentially the same algorithm
as SAD and VSAD in AIPS) and SExtractor. Section 2
describes the operation of the Sfind 2.0 task
and the implementation of the FDR method. Section 3
describes the Monte-Carlo construction of
artificial images on which to compare Sfind 2.0, 
Imsad and SExtractor, the tests used in the com- 
parison and their results. Subsequent implemen- 
tation using a portion of a real radio image is also 
presented. Section 4 discusses the relative merits 
of each task, the effectiveness of the FDR method, 
and raises some issues regarding the validity of the 
null hypothesis (i.e. the background level) and the 
correlation between neighbouring pixels. Section 5 
presents our conclusions along with the strong sug- 
gestion that implementing FDR as a threshold 
defining method in other existing source detection 
tasks is worthwhile. 

2. Sfind 2.0 

Here we describe the algorithm used by Sfind 2.0, 
which implements FDR for identifying a detection 
threshold. We include the version number of the 
task simply to differentiate it from the earlier ver- 
sion of Sfind, also implemented under MIRIAD, 
as the source detection algorithm is significantly 
different. Subsequent revisions of Sfind will con- 
tinue to use the FDR thresholding method. The 
elliptical Gaussian fitting routine used to measure 
identified sources has not changed, however, and 
is the same as that used in Imfit and Imsad. An 
example of the use of the earlier version of Sfind
can be found in the source detection discussion of 
Hopkins et al. (1999). 

The first step performed is to normalise the 
image. A Gaussian is fit to the pixel histogram 
in regions of a user-specified size to establish the 
mean and standard deviation, σ, for each region.
Then for each region of the image, the mean is
subtracted and the result is divided by σ. Ideally
this leaves an image with uniform noise charac- 
teristics, defined by a Gaussian with zero mean 
and unit standard deviation. In practice the finite 
size of the regions used may result in some non- 
uniformity, although a judicious choice of size for 
these regions should minimise any such effect. We 
note that radio interferometer images often contain
image artifacts such as residual sidelobes arising
from the image-processing, sampling effects,
and so on. With adequate sampling these effects 
should be statistically random with zero mean, 
and simply add to the overall image noise. 
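
The normalisation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the Sfind 2.0 code: the median and the scaled median absolute deviation are swapped in for the Gaussian fit to the pixel histogram, the image is a plain list of rows, and the name `normalise` and the `box` parameter are hypothetical.

```python
import random
import statistics

def normalise(image, box):
    """Subtract a local mean and divide by a local sigma, box by box.

    ASSUMPTION: the median and the scaled median absolute deviation (MAD)
    stand in for the Gaussian fit to the pixel histogram described in the
    text. A region of constant intensity would give sigma = 0 and is not
    handled in this sketch.
    """
    ny, nx = len(image), len(image[0])
    out = [[0.0] * nx for _ in range(ny)]
    for y0 in range(0, ny, box):
        for x0 in range(0, nx, box):
            cells = [(y, x)
                     for y in range(y0, min(y0 + box, ny))
                     for x in range(x0, min(x0 + box, nx))]
            values = [image[y][x] for y, x in cells]
            centre = statistics.median(values)
            mad = statistics.median([abs(v - centre) for v in values])
            sigma = 1.4826 * mad  # MAD-to-sigma factor for a Gaussian
            for y, x in cells:
                out[y][x] = (image[y][x] - centre) / sigma
    return out

# Pure noise with mean 5 and sigma 2 should come out near mean 0, sigma 1.
rng = random.Random(1)
raw = [[rng.gauss(5.0, 2.0) for _ in range(64)] for _ in range(64)]
flat = [v for row in normalise(raw, box=32) for v in row]
print(statistics.mean(flat), statistics.pstdev(flat))
```

A robust estimator is used here because real sources bias a simple mean and standard deviation upwards; a fit to the core of the histogram, as in the text, serves the same purpose.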

Next the FDR threshold is calculated for the 
whole image. The null hypothesis is taken to be 
that each pixel is drawn from a gaussian distribu- 
tion with zero mean and unit standard deviation. 
This corresponds to the 'background pixels'. In 
the absence of real sources, each pixel has a probability
p (which varies with its normalised intensity)
of being drawn from such a distribution. In
images known to contain real sources, a low p- 
value for a pixel (calculated under the assumption 
that no sources are present) is often used as an 
indicator that it is a 'source-pixel.' The p-values
for all N pixels in the image are calculated and 
ordered. The threshold is then defined by plotting
the ordered values as a function of i/N (where
N is the total number of pixels and i is the index,
from 1 to N) and finding the p-value, $p_{\mathrm{cut}}$
say, corresponding to the last point of intersection
between these and a line of slope $\alpha/c_N$. Here α is
the maximum fraction of falsely detected source-pixels
to allow, on average (over multiple possible
instances of the noise), and $c_N = 1$ if the statistical
tests are fully independent (the pixels are uncorrelated).
If the tests are dependent (the pixels are
correlated) then

$$c_N = \sum_{i=1}^{N} \frac{1}{i}. \qquad (1)$$






Table 1 

Existing radio source detection and measurement tasks. 



AIPS              MIRIAD      Short description

IMFIT/JMFIT       Imfit       Fits multiple Gaussians to all pixels in a defined area.
SAD/VSAD/HAPPY    Imsad       Defines 'islands' encompassing pixels above a user-defined
                              threshold, and fits multiple Gaussians within these areas.
-                 Sfind 2.0   Defines a threshold using FDR to determine pixels belonging
                              to sources, and fits those by a Gaussian.



Since most radio images (and indeed astronomical 
images in general) show some degree of correlation 
between pixels, but tend not to be fully correlated, 
i.e. the intensity of a given pixel is not influenced 
by that of every other pixel, we have chosen to 
take an intermediate estimate for $c_N$ reflecting the
level of correlation present in the image. This is related
to the synthesised beam size, or point-spread
function (PSF). If n is the (integer) number of pixels
representing the PSF we define $c_N = \sum_{i=1}^{n} 1/i$.
This will be discussed further in Section 3.1. A diagrammatic
example of the threshold calculation
is shown in Figure 1. It becomes obvious from this
Figure that increasing or decreasing the value chosen
for α corresponds to increasing or decreasing
the resulting p-value threshold, $p_{\mathrm{cut}}$, and the number
of pixels thus retained as 'source-pixels'. The
FDR formalism ensures that the average fraction
of false 'source-pixels' will never exceed α. As described
in Miller et al. (2001), this explanation of
implementing FDR does not explain or justify the
validity of FDR. The reader is referred to Miller
et al. (2001) for a heuristic justification, and to
Benjamini & Hochberg (1995) and Benjamini &
Yekutieli (2001) for the detailed statistical proof.
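
The threshold calculation described above can be sketched in a few lines. This is an illustrative version under stated assumptions rather than the Sfind 2.0 implementation: the p-values are taken to be one-sided Gaussian tail probabilities of the normalised intensities, and the function names and the example p-values are hypothetical. Comparing each ordered p-value with i α / (c_N N) is equivalent to the graphical construction against i/N with a line of slope α/c_N.

```python
import math

def harmonic(n):
    """Correction factor of the form sum of 1/i for i = 1..n."""
    return sum(1.0 / i for i in range(1, n + 1))

def p_value(z):
    """One-sided p-value of a normalised pixel intensity z under the null."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def fdr_cut(pvalues, alpha, c_n=1.0):
    """Largest ordered p-value on or below the line i * alpha / (c_n * N).

    Returns None when no p-value passes, i.e. nothing is detected.
    """
    n_total = len(pvalues)
    cut = None
    for i, p in enumerate(sorted(pvalues), start=1):
        if p <= i * alpha / (c_n * n_total):
            cut = p  # keep the last crossing, not the first
    return cut

# Hypothetical p-values, chosen only to exercise the function.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(fdr_cut(example, alpha=0.05))                    # independent tests
print(fdr_cut(example, alpha=0.05, c_n=harmonic(3)))   # partially correlated
```

Pixels with p-values at or below the returned cut are retained as source-pixels. Increasing `c_n` flattens the line and so can only lower the cut, which is why the correlated form is the more conservative one.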

Finally, once the FDR threshold is defined, the 
pixels with $p < p_{\mathrm{cut}}$, corresponding to 'source-pixels',
can be analysed. Each of the source-pixels
is investigated in turn as follows. A hill-climbing
procedure starts from the source-pixel and finds 
the nearest local maximum from among the con- 
tiguous source-pixels. From this local peak, a 
collection of contiguous, monotonically decreasing 
pixels is selected to represent the source. At
this point, it is possible to either (1) use all of
the pixels around the peak which satisfy these criteria,
or (2) use only those which are themselves
above the FDR threshold. The latter is the de- 
fault operation of Sfind 2.0, but the user can spec- 
ify an option for choosing the former method as 
well. The former method, which allows pixels be- 
low the FDR threshold to be included in a source 
measurement, may be desirable for obtaining more 
reasonable source parameters for sources close to 
the threshold. In either case, the resulting collec- 
tion of source-pixels is fit in a least-squares fash- 
ion by a two-dimensional elliptical gaussian. If the 
fitting procedure does not converge, the source is 
rejected. This is typically the case when the potential
source consists of too few pixels to constrain
the fit well. It is likely that most such rejections
will be due to noise-spikes, which typically
contain a small number of pixels, although some 
may be due to real sources lying just below the 
threshold such that only a few true source-pixels 
appear above it. If the fit is successful the source 
is characterised by the fitted gaussian parameters, 
and the pixels used in this process are flagged as 
already 'belonging' to a source to prevent them 
from being investigated again in later iterations of 
this step. On completion, a source catalogue is 
written by the task, and images showing (1) the 
pixels above the FDR threshold, and (2) the nor- 
malised image, may optionally be produced. 
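
The hill-climbing and pixel-selection steps described above might be sketched as follows. This is an illustration only: the 8-connected neighbourhood, the strict inequality used for "monotonically decreasing", and all function names are assumptions made here, and the Gaussian fitting step is omitted.

```python
def neighbours(y, x, shape):
    """Yield the 8-connected neighbours of (y, x) that lie inside the image."""
    ny, nx = shape
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy or dx) and 0 <= y + dy < ny and 0 <= x + dx < nx:
                yield y + dy, x + dx

def hill_climb(image, mask, start):
    """Walk uphill through masked pixels until no masked neighbour is brighter."""
    shape = (len(image), len(image[0]))
    y, x = start
    while True:
        best = max((p for p in neighbours(y, x, shape) if mask[p[0]][p[1]]),
                   key=lambda p: image[p[0]][p[1]],
                   default=(y, x))
        if image[best[0]][best[1]] <= image[y][x]:
            return y, x
        y, x = best

def grow_source(image, mask, peak, above_threshold_only=True):
    """Collect contiguous pixels that strictly decrease away from the peak.

    With above_threshold_only=False, pixels below the FDR threshold may be
    included, corresponding to option (1) in the text.
    """
    shape = (len(image), len(image[0]))
    members = {peak}
    stack = [peak]
    while stack:
        y, x = stack.pop()
        for p in neighbours(y, x, shape):
            if p in members:
                continue
            if above_threshold_only and not mask[p[0]][p[1]]:
                continue
            if image[p[0]][p[1]] < image[y][x]:
                members.add(p)
                stack.append(p)
    return members

image = [[0, 0, 0, 0, 0],
         [0, 1, 2, 1, 0],
         [0, 2, 5, 2, 0],
         [0, 1, 2, 1, 0],
         [0, 0, 0, 0, 0]]
mask = [[v > 1.5 for v in row] for row in image]  # stand-in for p < p_cut
peak = hill_climb(image, mask, (1, 2))
print(peak, len(grow_source(image, mask, peak)))
```

In a full task the members of each accepted source would be flagged so that later iterations skip them, as described above.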

3. Task comparisons 

To compare the effectiveness of Sfind 2.0, Im- 
sad and SExtractor, one hundred artificial images 
361 × 391 pixels in size were generated. Each
of these contained a different instance of random 
gaussian noise and the same catalogue of 72 point
sources with known properties (position and in- 
tensity). The intensity distribution of the sources 
spans a little more than 2 orders of magnitude,
ranging from somewhat fainter than the noise level 
to well above it. Many fewer bright sources were 
assigned than faint sources, in order to produce 
a realistic distribution of source intensities. The 
artificial images were convolved with a gaussian 
to mimic the effects of a radio telescope PSF. The 
test images have 2″ pixels, and the gaussian chosen
to represent the PSF has FWHMs of 11.73″ × 5.52″
with a position angle of 16.3°. The (convolved) artificial
sources in the absence of noise can be seen
in Figure 2, along with one of the test images. 
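
For reference, the PSF parameters quoted above can be converted to pixel units as in the sketch below. The beam area obtained this way, about 18 pixels, is one plausible value for the number of pixels n representing the PSF, but the text does not state how n was derived, so this is an assumption. The position-angle convention (measured from the +y axis towards +x) and the function names are also assumptions made for illustration.

```python
import math

# Ratio of Gaussian sigma to FWHM: 1 / (2 * sqrt(2 * ln 2))
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def beam_area_pixels(fwhm_major, fwhm_minor, pixel_size):
    """Area under a unit-peak elliptical Gaussian, in pixels."""
    return (math.pi * fwhm_major * fwhm_minor
            / (4.0 * math.log(2.0) * pixel_size ** 2))

def psf_value(dx, dy, fwhm_major, fwhm_minor, angle_deg, pixel_size):
    """Unit-peak elliptical Gaussian at a pixel offset (dx, dy) from its centre.

    ASSUMPTION: the major axis is rotated by angle_deg from +y towards +x.
    """
    theta = math.radians(angle_deg)
    s_major = fwhm_major * FWHM_TO_SIGMA / pixel_size
    s_minor = fwhm_minor * FWHM_TO_SIGMA / pixel_size
    along = dx * math.sin(theta) + dy * math.cos(theta)    # along major axis
    across = dx * math.cos(theta) - dy * math.sin(theta)   # along minor axis
    return math.exp(-0.5 * ((along / s_major) ** 2 + (across / s_minor) ** 2))

area = beam_area_pixels(11.73, 5.52, 2.0)
print(f"beam area: {area:.2f} pixels")
```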

On each image, Sfind 2.0 was run with α = 0.01,
0.05 and 0.1, and the resulting lists of detected 
sources compared with the input catalogue. By 
way of an example, the sources detected in a sin- 
gle test of Sfind 2.0 are shown in Figure 3, which 
indicates by an ellipse the location, size and po- 
sition angle of each detected source. This exam- 
ple demonstrates the difficulty of detecting faint 
sources, and the ability of noise to mimic the char- 
acteristics of faint sources. 

3.1. Source pixel detection 

Miller et al. (2001) examined the simplest pos- 
sible scenario, source-pixels placed on a regular 
grid in the presence of uncorrelated gaussian noise. 
Here we investigate a much more realistic situ- 
ation. The 'source-pixels' now lie in contiguous 
groups comprising 'real' sources in the sense that 
the whole image (background and sources) has 
been convolved with a PSF. The number of pixels 
in each source above a certain threshold will vary 
depending on the intensity of the source. We now 
confirm that the FDR method works consistently 
on these realistic images. 

To verify the reliability of the FDR defined 
threshold, the number of pixels detected above the 
FDR threshold in each test were recorded along 
with the number which were unassociated with 
any true source. The distribution of this fraction 
of false FDR pixels should never exceed the value 
specified for α, and this can be seen in the histogram
in Figure 4. This Figure also shows how
the distribution of falsely detected pixels changes
with the form chosen for $c_N$, emphasising that the
assumption of complete independence of the pixels
is not justified (as expected), but neither is the
image fully correlated, evidenced by the conservative
level of false detections seen under this assumption.
Our choice for the form of $c_N$, which is
not in fact a result of the rigorous statistical proof, 
appears to be a feasible and reliable intermediate 
for such 'partially correlated' images. 
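
A simplified version of this test can be reproduced with a short Monte-Carlo sketch. It is deliberately simpler than the simulations described here: the noise is uncorrelated (so $c_N = 1$ is appropriate), the "sources" are isolated pixels of fixed amplitude, and the image size, source count, amplitude and random seed are arbitrary choices made for illustration.

```python
import math
import random

def fdr_cut(pvalues, alpha):
    """FDR cut for independent tests (c_N = 1); None if nothing passes."""
    n_total = len(pvalues)
    cut = None
    for i, p in enumerate(sorted(pvalues), start=1):
        if p <= i * alpha / n_total:
            cut = p
    return cut

def false_pixel_fraction(n_pixels, n_source, amplitude, alpha, rng):
    """Simulate one image and return false detections / all detections."""
    # The first n_source pixels carry signal; the rest are pure noise.
    z = [rng.gauss(0.0, 1.0) + (amplitude if k < n_source else 0.0)
         for k in range(n_pixels)]
    p = [0.5 * math.erfc(v / math.sqrt(2.0)) for v in z]
    cut = fdr_cut(p, alpha)
    if cut is None:
        return 0.0  # no detections, so no false detections
    detected = [k for k in range(n_pixels) if p[k] <= cut]
    n_false = sum(1 for k in detected if k >= n_source)
    return n_false / len(detected)

rng = random.Random(42)
alpha = 0.1
fractions = [false_pixel_fraction(10000, 500, 4.0, alpha, rng)
             for _ in range(20)]
mean_fraction = sum(fractions) / len(fractions)
print(f"mean false fraction over 20 images: {mean_fraction:.3f} "
      f"(alpha = {alpha})")
```

Individual images may exceed α; it is the average over noise realisations that the FDR procedure constrains.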

3.2. Source detection 

The FDR formalism ensures that the average 
fraction of falsely detected pixels will be less than 
α. The connection between numbers of pixels and
numbers of sources is complex, however, and the 
same criterion cannot be said to be true for the 
fraction of falsely detected sources. The num- 
ber of source-pixels per source will vary accord- 
ing to both instrumental effects, such as the sam- 
pling and the PSF, as well as intrinsic source sizes 
compared to the instrumental resolution and the 
source brightnesses compared to the noise level in 
an image. Even if all sources are point-like, and 
hence should appear in the image as a PSF, the 
number of source-pixels above a given threshold 
for a given source depends on its brightness, and 
the number of source-pixels per source would not 
be expected to be constant. To investigate the 
effect of this complex relation, we explore empiri- 
cally the results of applying FDR thresholding to 
our simulated images. The fraction of falsely detected
sources in each image, as well as the fraction
of true sources not detected, are shown in Figure 5
as distributions for each tested value of α.
By construction, a number of the artificial sources 
have intensities comparable to or lower than the 
noise level in the images, so not all sources will be 
able to be recovered in every image. This is re- 
flected in the fact that somewhat more than 5% of 
sources are missed (by all tasks tested) even with 
very liberal thresholds. 
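
The bookkeeping behind these two fractions can be sketched as a positional cross-match between the detected sources and the input catalogue. The association criterion is not specified in the text, so the matching radius, the nearest-neighbour rule and the function name below are assumptions for illustration.

```python
import math

def cross_match(detections, catalogue, radius):
    """Return (false detection fraction, missed source fraction).

    A detection counts as true if it lies within `radius` pixels of the
    nearest catalogue source that has not already been claimed.
    """
    unclaimed = list(catalogue)
    n_true = 0
    for dx, dy in detections:
        nearest = min(unclaimed,
                      key=lambda s: math.hypot(s[0] - dx, s[1] - dy),
                      default=None)
        if (nearest is not None
                and math.hypot(nearest[0] - dx, nearest[1] - dy) <= radius):
            unclaimed.remove(nearest)
            n_true += 1
    n_false = len(detections) - n_true
    false_fraction = n_false / len(detections) if detections else 0.0
    missed_fraction = len(unclaimed) / len(catalogue) if catalogue else 0.0
    return false_fraction, missed_fraction

# Hypothetical positions in pixel coordinates.
catalogue = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
detections = [(10.5, 10.2), (50.0, 50.0)]
print(cross_match(detections, catalogue, radius=2.0))
```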

The histograms in Figure 5(a) show that for 
α = 0.1, where up to 10% of pixels could be expected
to be false, the fraction of false sources is
not much more. The result for α = 0.05 is also
quite reasonable, although for α = 0.01 the outliers
are further still, relatively speaking, from the
expected fraction. While the strict constraint ap- 
plicable to the fraction of false source-pixels no 
longer holds for false sources, it still seems to be 
quite a good estimator. For the case where only 
the peak pixel is required to be above the FDR 
threshold, Figure 5(c), the fractions of falsely detected
sources are not so strongly constrained. For
α = 0.1, the fraction may be almost twice that expected.
In both cases, although with greater reliability
in the former, this allows the FDR method to
provide an estimate of the fraction of false sources 
to expect. Even though the constraint may not 
be rigorous, and clearly the estimate will be much 
more reliable in the case where all source-pixels 
are required to be above the FDR threshold, the
FDR method allows a realistic a priori estimate of 
the fraction of false detections to be made. This 
feature is not possible with the simple assumption 
of, say, a 5σ threshold.

To test Imsad and SExtractor in the same fash- 
ion as Sfind 2.0, a choice of threshold value as a 
multiple of the image noise level (σ) was required.
Simply selecting a canonical value of 3σ, 5σ or 7σ,
for example, would complicate the comparison between
these tasks and Sfind 2.0, as this would be
testing not only different source measurement routines
but also potentially different thresholds. The
values of α selected for testing Sfind 2.0 result in
threshold levels which correspond approximately
(since the noise level varies minimally from image
to image) to 4.1σ, 3.6σ and 3.3σ. A 5σ threshold
in these simulations would correspond to a value
of α ≈ 0.0001.
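
Converting between a threshold in units of σ and the corresponding p-value cut is a matter of evaluating the Gaussian tail, as in the sketch below. Note that the quantity converted is the p-value cut, not α itself; the two are linked only through the data. One-sided tail probabilities are assumed, and the function names are illustrative.

```python
from statistics import NormalDist

_UNIT = NormalDist()  # zero mean, unit standard deviation

def sigma_to_p(threshold_sigma):
    """One-sided tail probability above a threshold given in units of sigma."""
    return 1.0 - _UNIT.cdf(threshold_sigma)

def p_to_sigma(p_cut):
    """Threshold in units of sigma whose one-sided tail probability is p_cut."""
    return _UNIT.inv_cdf(1.0 - p_cut)

for level in (3.3, 3.6, 4.1, 5.0):
    print(f"{level:.1f} sigma -> p = {sigma_to_p(level):.2e}")
```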

In a brief aside, it should be emphasised that 
this particular correspondence between a choice of 
α and a particular σ-threshold is only valid for the
noise and source characteristics of the images used
in the present simulations. For images with different
noise levels or different source intensity distributions,
any particular value of α will correspond
to some different multiple of the local noise level.
The primary advantage of specifying an FDR α
value over choosing a 5σ threshold, say, is that
the FDR threshold is adaptive. The FDR thresh- 
old will assume a different value depending on the 
source intensity distribution relative to the back- 
ground. This point is made very strongly in Miller 
et al. (2001) in the diagrams of their Figure 4. 
As a specific example in the context of the cur- 
rent simulations, we investigated additional simu- 
lated images containing the same noise as in the 
current simulations but containing sources having 
very different intensity distributions. We chose 
one intensity distribution such that every source 
was 10 times brighter than in the current simulations
and one such that every source was 10 times
fainter. In the brighter case, the FDR threshold 
for α = 0.01, ensuring that on average no more
than 1% of source-pixels would be falsely detected,
corresponded not to 4.1σ, but to about 3.8σ. The
reason here is that as the source distribution be- 
comes brighter, many more pixels will have low 
p-values. To retain the constant fraction of false 
pixels more background pixels must be included, 
so the threshold becomes lower. In the fainter 
case, where the artificial sources are very close in 
intensity to the noise level, the same FDR threshold
corresponds to about 4.4σ. Of course in this
case very few sources are detected, for obvious reasons,
but the same constraint on the fraction of
falsely detected pixels applies. Here fewer pixels
will have low p-values, thus fewer background pixels
are allowed, maintaining a constant fraction of
false pixels, and the threshold increases. In the
brighter case the simple assumption of, say, a 4σ
threshold would give a lower rate of false pixels,
while in the fainter case it would give a higher fraction.
The importance of these examples is to emphasise
that FDR provides a consistent constraint
on the fraction of false detections in an adaptive
way, governed by the source intensity distribution
relative to the background, which cannot be reproduced
by the simple assumption of, for example,
a 4σ threshold. While the source distributions
in most astronomical images typically lie between 
the two extremes presented for this illustration, 
the adaptability of the FDR thresholding method 
still presents itself as an important tool. 
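
The adaptive behaviour can be made concrete with an approximate relation. At the last crossing of the ordered p-values and the FDR line, the cut is close to α R / (c_N N), where R is the number of pixels retained, so a larger R implies a lower threshold in units of σ. The sketch below evaluates this for a few hypothetical values of R. The choice of 18 pixels per PSF for $c_N$ is an assumption based on the Gaussian beam area for the stated FWHM and pixel size, and the pixel counts are illustrative, not measured values from the simulations.

```python
from statistics import NormalDist

def harmonic(n):
    """Sum of 1/i for i = 1..n."""
    return sum(1.0 / i for i in range(1, n + 1))

def approx_sigma_threshold(n_detected, n_pixels, alpha, c_n):
    """Approximate FDR threshold, in sigma, when n_detected pixels pass the cut.

    Uses p_cut ~ alpha * n_detected / (c_n * n_pixels), which holds
    approximately at the last crossing of the ordered p-values and the line.
    """
    p_cut = alpha * n_detected / (c_n * n_pixels)
    return NormalDist().inv_cdf(1.0 - p_cut)

n_pixels = 361 * 391   # image size used in the simulations
c_n = harmonic(18)     # ASSUMPTION: about 18 pixels represent the PSF
for n_detected in (250, 1000, 4000):   # hypothetical source-pixel counts
    t = approx_sigma_threshold(n_detected, n_pixels, 0.01, c_n)
    print(f"{n_detected:5d} retained pixels -> threshold of about {t:.2f} sigma")
```

A fixed 4σ threshold has no such dependence on R, which is why its false fraction drifts as the source population changes.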

Returning now to the comparison between 
Sfind 2.0, Imsad and SExtractor, the Imsad and 
SExtractor thresholds were set to correspond to 
those derived from the α values used in Sfind 2.0.
The distributions of falsely detected and missed 
sources were similarly calculated. These are shown 
in Figures 5(e) to 5(h). One of the features of SEx- 
tractor is the ability to set a minimum number of 
contiguous pixels required before a source is con- 
sidered to be real, and obviously the number of de- 
tected sources varies strongly with this parameter. 
After some experimentation we set this parame- 
ter to 7 pixels, as this resulted in a distribution 
of false detections most similar to that seen with 
Sfind 2.0, for the case where only pixels above the 
FDR threshold are used in source measurements. 
Values larger than 7 reduced the number of false 
detections at the expense of missing more true
sources, and vice-versa. From this comparison, 
Sfind 2.0 appears to miss somewhat fewer of the 
true sources than SExtractor when SExtractor is 
constrained to the same level of false detections. 
This is also true if SExtractor is constrained to a 
similar distribution of false detections as obtained 
by Sfind 2.0 for the case where only the peak pixel 
is required to be above the FDR threshold (cor- 
responding to 4 contiguous pixels). In both cases, 
allowing SExtractor to detect more true sources by 
lowering the minimum pixel criterion introduces 
larger numbers of false detections. Sfind 2.0 also 
performs favourably compared to Imsad. While 
Sfind 2.0 seems to miss a few percent more real 
sources than Imsad, Imsad seems to detect many 
more false sources than Sfind 2.0 in either of its 
source-measurement modes. 
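
The effect of a minimum-contiguous-pixel criterion can be illustrated with a simple connected-component sketch. This is not SExtractor's implementation: the 8-connected neighbourhood, the function name and the toy mask are assumptions made for illustration.

```python
def islands(mask, min_pixels=1):
    """Group True pixels into 8-connected islands and drop the small ones."""
    ny, nx = len(mask), len(mask[0])
    seen = [[False] * nx for _ in range(ny)]
    found = []
    for y0 in range(ny):
        for x0 in range(nx):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            stack, island = [(y0, x0)], []
            while stack:
                y, x = stack.pop()
                island.append((y, x))
                for yy in range(max(y - 1, 0), min(y + 2, ny)):
                    for xx in range(max(x - 1, 0), min(x + 2, nx)):
                        if mask[yy][xx] and not seen[yy][xx]:
                            seen[yy][xx] = True
                            stack.append((yy, xx))
            if len(island) >= min_pixels:
                found.append(island)
    return found

# Toy mask: a 3-pixel "noise spike" and an 8-pixel "source".
rows = ["##......",
        "#.......",
        "........",
        "....####",
        "....####",
        "........"]
toy_mask = [[c == "#" for c in row] for row in rows]
print(len(islands(toy_mask)), len(islands(toy_mask, min_pixels=7)))
```

Raising `min_pixels` removes small noise clumps but also any faint real source with only a few pixels above the threshold, which is the trade-off described above.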

Of primary importance in source measurement 
is the reliability of the source parameters mea- 
sured. Figure 6 shows one example from the 
one hundred tests comparing the true intensities 
and positions of the artificial sources with those 
measured by Sfind 2.0. Similar results are ob- 
tained with Imsad, which uses the same gaus- 
sian fitting routine. As expected, the measured 
values of intensity and position become less re- 
liable as the source intensity becomes closer to 
the noise, although they are still not unreason- 
able. A comprehensive analysis of gaussian fitting 
in astronomical applications has been presented 
by Condon (1997), and the results of the gaus- 
sian fitting performed by Sfind 2.0 (and Imsad) 
are consistent with the errors expected. Addition- 
ally, the assumption that a source is point-like, 
or only slightly extended, and thus well fit by a 
two-dimensional elliptical gaussian, is clearly not 
always true. Complex sources in radio images, 
as in any astronomical image, present difficulties
for simple source detection algorithms such as the 
ones investigated here. It is not the aim of the 
current analysis to address these problems, except 
to mention that the parameters of such sources 
measured under the point-source assumption will 
suffer from much larger errors than indicated by
the results of the gaussian fitting. 

As a final test, Sfind 2.0 was used to identify 
sources in a real radio image, a small portion of the 
Phoenix Deep Survey (Hopkins et al. 1999). This 
image contains sources with extended and complex
morphologies as well as point sources. Figure 7
shows the results, with Sfind 2.0 reliably identifying
point sources and extended objects as well
as the components of various blended sources and 
complex objects. 

4. Discussion 

The main aims of this analysis have been to (1) 
investigate the implementation of FDR thresholding
in an astronomical source detection task, and
(2) compare the rates of missed and falsely detected
sources between this task and others commonly
used. Implementation of the FDR thresholding
method is very straightforward, as evidenced
by the seven step IDL example in appendix B of
Miller et al. (2001). The FDR method
performs as expected in providing a statistically 
reliable estimate of the fraction of falsely detected 
pixels. Performing source detection on a set of pix- 
els introduces the transformation of pixel groups 
into sources. This ultimately results in the strong 
constraint on the false fraction of FDR-selected 
pixels becoming a less rigorous, but still use- 
ful and reliable estimate of the fraction of false 
sources. As already mentioned, this is still a more 
quantitative statement than can be made of the 
rate of false sources in the absence of the FDR 
method. It is possible that rigorous quantitative 
constraints on the fraction of false source detec- 
tions may be obtained empirically for individual 
images or surveys. By performing Monte-Carlo 
source detection simulations with artificial images 
having noise properties similar to the ones under 
investigation, the trend of falsely detected sources 
with α may be able to be reliably characterised.
This has not been examined in the current analy- 
sis, but will be included in subsequent work with 
Sfind 2.0. While constraints on the fraction of 
falsely detected sources may be possible, neither 
FDR nor any other thresholding method provides 
constraints on the numbers of true sources remain- 
ing undetected. 

A study is also ongoing into whether a more 
sophisticated FDR thresholding method for defin- 
ing a source may be feasible. This would in- 
volve examining the combined size and brightness 
properties of groups of contiguous pixels to define 
a new p-value. This would represent the likelihood
that such a collection of pixels comes from a
'background distribution' or null-hypothesis, corresponding
to the properties exhibited by noise in
various types of astronomical images. Using this
new p-value an FDR threshold, now in the size-brightness
parameter space, could be applied for
defining a source catalogue. Clearly much care will 
need to be taken to avoid discriminating against 
true sources which may lie in certain regions of 
the size-brightness plane, such as low surface- 
brightness galaxies. The assumption in many ex- 
isting source-detection algorithms that sources are 
point-like already suffers from such discrimination, 
though, so even if such bias is unavoidable, some 
progress may still be achievable. 

The form assumed for $c_N$ in this analysis is 
not in fact a rigorous result of the formal FDR 
proof. Instead it is a 'compromise' estimate that 
seems mathematically reasonable, and gives reliable 
results in practice. To be strictly conservative, 
the form of $c_N$ given by equation 1 should be 
adopted to ensure that the fraction of falsely detected 
'source-pixels' is strictly less than $\alpha$. This 
rigorous treatment, however, is dependent on the 
number of pixels present in the image. Now con- 
sider analysis of a sub-region within an image. As 
the size of this sub-region is changed, the number 
of pixels being considered similarly changes, and 
this will have the effect of changing the threshold 
level, and the resulting source catalogue. This 
perhaps non-intuitive aspect of the FDR formalism 
is the adaptive mechanism which allows it to 
be rigorous in constraining the fraction of false 
detections. 
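
As an illustrative sketch (not the Sfind 2.0 implementation), the dependence of the threshold on the number of pixels can be seen by evaluating $c_N$ for different region sizes and applying the FDR cut. The sketch assumes that the rigorous form is the harmonic sum $c_N = \sum_{i=1}^{N} 1/i$ and that the cut is the largest ordered p-value $p_{(j)}$ satisfying $p_{(j)} \le j\alpha/(c_N N)$. The function names and the example p-values are hypothetical.

```python
def c_n(n_pixels):
    """Harmonic sum c_N = 1 + 1/2 + ... + 1/N (assumed rigorous form)."""
    return sum(1.0 / i for i in range(1, n_pixels + 1))


def fdr_threshold(p_values, alpha, c):
    """Largest ordered p-value p_(j) with p_(j) <= j * alpha / (c * N).

    Returns None when no p-value satisfies the condition.
    """
    ordered = sorted(p_values)
    n = len(ordered)
    p_cut = None
    for j, p in enumerate(ordered, start=1):
        if p <= j * alpha / (c * n):
            p_cut = p
    return p_cut


if __name__ == "__main__":
    # c_N grows roughly logarithmically with the number of pixels considered,
    # so enlarging or shrinking the analysed region shifts the threshold.
    for n_pixels in (18, 64 * 64, 512 * 512):
        print(n_pixels, round(c_n(n_pixels), 3))

    # Hypothetical p-values: the larger c is, the stricter the cut becomes.
    p_values = [0.001, 0.009, 0.039, 0.041, 0.9]
    print(fdr_threshold(p_values, alpha=0.05, c=1.0))
    print(fdr_threshold(p_values, alpha=0.05, c=c_n(len(p_values))))
```

For 18 pixels the sum is about 3.495, the value quoted for the PSF area in Fig. 1. For the five example p-values the cut drops from 0.009 with $c_N = 1$ to 0.001 with the harmonic sum.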

There are additional complicating factors which 
must be taken into account when performing 
source detection. The null hypothesis assumed for 
the FDR method (and indeed for all the source detection 
algorithms) is that the background pixels 
have intensities drawn from a Gaussian distribution 
(or other well-characterised statistical distribution 
such as a Poissonian). This is not strictly 
true for radio images, where residual image pro- 
cessing artifacts may affect the noise properties, 
albeit at a low level. In all cases, such deviations 
will result in a larger fraction of false pixels than 
expected, some of which may be clumped in a 
fashion sufficient to mimic, and be measured as, 
sources, thus increasing the fraction of falsely detected 
sources. This comment simply serves 
as a reminder to use caution when analysing images 
with complex noise properties. 
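
The sensitivity to the null hypothesis can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that a small fraction of background pixels is drawn from a broader Gaussian (a stand-in for residual artifacts) while the threshold is still set from the nominal Gaussian. The mixture model and its parameter values are hypothetical and are not measured from any image.

```python
import math


def gaussian_tail(z):
    """Upper-tail probability of a unit Gaussian beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def false_pixel_rate(z_cut, wide_fraction=0.0, wide_sigma=2.0):
    """Probability that a pure-noise pixel exceeds z_cut.

    z_cut is expressed in units of the nominal (assumed) noise sigma.
    A fraction `wide_fraction` of pixels is drawn from a Gaussian whose
    width is `wide_sigma` times the nominal one (hypothetical artifact model).
    """
    core = (1.0 - wide_fraction) * gaussian_tail(z_cut)
    wide = wide_fraction * gaussian_tail(z_cut / wide_sigma)
    return core + wide


if __name__ == "__main__":
    nominal = false_pixel_rate(3.0)
    contaminated = false_pixel_rate(3.0, wide_fraction=0.05)
    print(nominal, contaminated, contaminated / nominal)
```

With 5 per cent of pixels drawn from a distribution twice as broad, the rate of false pixels above a nominal 3-sigma cut rises from about 0.13 per cent to about 0.46 per cent, even though the bulk of the noise is unchanged.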

5. Conclusions 

We have implemented the FDR method in an 
astronomical source detection task, Sfind 2.0, and 
compared it with two other tasks for detecting 
and measuring sources in radio telescope images. 
Sfind 2.0 compares favourably to both in the frac- 
tions of falsely detected sources and undetected 
true sources. The FDR method reliably selects 
a threshold which constrains the fraction of false 
pixels with respect to the total number of 'source- 
pixels' in realistic images. The fraction of falsely 
detected sources is not so strongly constrained, although 
quantitative estimates of this fraction are 
still reasonable. More investigation of the relation- 
ship between 'source-pixels' and 'sources' is war- 
ranted to determine if a more rigorous constraint 
can be established. Given the ability to quantify 
the fraction of false detections that the 
FDR method provides, we strongly recommend 
implementing it as a threshold-defining 
method in existing source detection tasks. 

Fig. 1. — A graphical example of how the FDR threshold is calculated. This diagram shows only the relevant 
portion of the full graph (which has abscissa spanning 0 to 1). The points show p-values of the pixels from 
one of the artificial images. The line has a slope of $\alpha/c_N$ with $\alpha = 0.05$ and $c_N \approx 3.495$, corresponding to 
a PSF area covering 18 pixels. The last intersection point gives Pcut = 2.19 x 10~^. Hence all pixels with 
p < 2.19 X 10~^ are considered 'source-pixels'. 

Fig. 2. — Example artificial images. Left: Artificial sources only. Right: Artificial sources in the presence 
of noise as used in the simulations, emphasising that real sources close to or below the noise level become 
difficult or impossible to detect, even visually. 

Fig. 3. Example artificial images, showing objects detected as sources by Sfind 2.0. The noiseless image 
(left) is shown for reference to make it clear which objects have been correctly detected or missed in this 
instance. 

Fig. 4. — Fraction of falsely detected pixels for $\alpha = 0.1$ assuming different forms for $c_N$. For the uncorrelated 
assumption, $c_N = 1$. For fully correlated, $c_N = \sum_{i=1}^{N} 1/i$. For 'partially correlated', $c_N = \sum_{i=1}^{n} 1/i$, where 
$n$ represents the number of pixels covering the 'correlation size' of the image, corresponding in our tests to 
the PSF size. For both the fully correlated and 'partially correlated' cases it can be seen empirically that 
$\langle \mathrm{FDR} \rangle < \alpha$ (where the angle brackets represent an ensemble average over replications of the data). 

Fig. 5. — Comparison of different source detection codes. The fractions of sources falsely detected (left) or 
missed (right) are shown for each of the tested tasks, (a) and (b) show results for Sfind 2.0 in its default 
mode, where all source-pixels are required to be above the FDR-threshold, (c) and (d) show results for 
Sfind 2.0 when only the peak-pixel is required to lie above the FDR threshold. 

Fig. 5. — Continued. Fractions of falsely detected (left) and missed (right) sources for Imsad (top) and 
SExtractor (bottom). The SExtractor results are based on setting a minimum requirement of 7 pixels for a 
source to be detected. 

Fig. 6. — Comparison of measured source parameters with the true values. Upper left: The ratio of measured 
to true flux density is shown as a function of the flux density. Upper right: Scatter plot of position errors. 
Lower left: Distribution of errors in RA. Lower right: Distribution of errors in Dec. The gaussians over the 
two histograms have a = (f.'l, and are indicative rather than fitted to the distributions. 

Fig. 7. — An example of a real radio image (a portion of the Phoenix Deep Survey). The marked sources 
have been detected using Sfind 2.0 and an FDR threshold corresponding to $\alpha = 0.01$. There are 69 detected 
sources, and the FDR method suggests that fewer than one of these is likely to be falsely detected. 